Study of noise robust voice activity detection based on periodic component to aperiodic component ratio
Abstract
This paper describes a study of noise robust voice activity detection (VAD) utilizing the periodic component to aperiodic component ratio (PAR). Although environmental sound changes dynamically in the real world, conventional noise robust features for VAD are sensitive to the non-stationarity of noise, which causes variations in the signal-to-noise ratio, and they sometimes require a priori noise power estimates. To overcome this problem, we adopt the PAR as an acoustic feature for VAD that is insensitive to the non-stationarity of noise. Hearing research also suggests that the decomposition of the periodic and aperiodic components plays an important role in the human auditory system. The proposed method first estimates the PAR of the observed signals with a harmonic filter in the frequency domain. Then it detects the presence of target speech signals based on the voice activity likelihood defined in relation to the PAR. The performance of the proposed VAD algorithm was examined using simulated and real noisy speech data. Comparisons confirmed that the proposed VAD algorithm outperforms conventional VAD algorithms, particularly in the presence of non-stationary noise.
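The two steps named in the abstract (estimate a PAR per frame, then decide on speech presence from it) can be illustrated with a rough sketch. This is not the paper's formulation: the function names `par_estimate` and `is_speech`, the assumption that the fundamental frequency `f0` is already known, the assumption that the frame spans a whole number of pitch periods, and the fixed `threshold` standing in for the voice activity likelihood are all hypothetical choices made for illustration.

```python
import cmath
import math
import random


def par_estimate(frame, fs, f0, eps=1e-12):
    """Rough periodic-to-aperiodic power ratio of one frame.

    Assumes f0 is known and the frame covers a whole number of pitch
    periods, so each harmonic falls on a single frequency.
    """
    n = len(frame)
    total = sum(s * s for s in frame) / n  # mean power of the frame
    periodic = 0.0
    k = 1
    while k * f0 < fs / 2:  # harmonics strictly below Nyquist
        w = -2j * math.pi * k * f0 / fs
        # Fourier coefficient evaluated at the k-th harmonic frequency
        coeff = sum(s * cmath.exp(w * i) for i, s in enumerate(frame))
        periodic += 2.0 * abs(coeff) ** 2 / (n * n)
        k += 1
    periodic = min(periodic, total)
    aperiodic = max(total - periodic, eps)  # everything not at a harmonic
    return periodic / aperiodic


def is_speech(frame, fs, f0, threshold=1.0):
    """Hypothetical decision rule: a plain threshold on the PAR."""
    return par_estimate(frame, fs, f0) > threshold


# Synthetic frames: 400 samples at 8 kHz is 10 periods of a 200 Hz tone.
random.seed(0)
fs, f0, n = 8000, 200.0, 400
voiced = [
    sum(math.sin(2 * math.pi * k * f0 * i / fs) / k for k in (1, 2, 3))
    + random.gauss(0.0, 0.05)
    for i in range(n)
]
noise = [random.gauss(0.0, 1.0) for _ in range(n)]
voiced_par = par_estimate(voiced, fs, f0)
noise_par = par_estimate(noise, fs, f0)
```

Because the ratio compares the two components within the same frame, scaling the whole frame leaves it unchanged, which is one intuition for why such a feature may be less sensitive to changing noise levels than an absolute energy measure.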
Related resources
Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio
This paper proposes a front-end processing method for automatic speech recognition (ASR) that employs a voice activity detection (VAD) method based on the periodic to aperiodic component ratio (PAR). The proposed VAD method is called PARADE (PAR based Activity DEtection). By considering the powers of the periodic and aperiodic components of the observed signals simultaneously, PARADE can detect...
A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
A Robust Voice Activity Detection Based on Noise Eigenspace Projection
A robust voice activity detector (VAD) is expected to increase the accuracy of ASR in noisy environments. This study focuses on how to extract robust information for designing a robust VAD. To do so, we construct a noise eigenspace by the principal component analysis of the noise covariance matrix. Projecting noisy speech onto the eigenspace, it is found that available information with higher S...
Analysis/synthesis and modification of the speech aperiodic component
The general framework of this paper is speech analysis and synthesis. The speech signal may be separated into two components: (1) a periodic component (which includes the quasi-periodic or voiced sounds produced by regular vocal cord vibrations); (2) an aperiodic component (which includes the non-periodic part of voiced sounds (e.g. fricative noise in /v/) or sound emitted without any vocal cor...
Pitch-Scaled Analysis based Residual Reconstruction for Speech Analysis and Synthesis
The typical problem in LPC-like vocoders is a buzzing sound, which is mainly due to the simple pulse train or noise excitation model. One way to improve it is to reconstruct the residual obtained from inverse filtering. So a new parametric representation of speech based on pitch-scaled analysis is proposed in this paper. Pitch-scaled analysis is used to extract the periodic spectrum of residual wit...